Results 1 - 20 of 1,430
1.
Bioscience ; 74(3): 169-186, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38560620

ABSTRACT

The impact of preserved museum specimens is transforming and increasing by three-dimensional (3D) imaging that creates high-fidelity online digital specimens. Through examples from the openVertebrate (oVert) Thematic Collections Network, we describe how we created a digitization community dedicated to the shared vision of making 3D data of specimens available and the impact of these data on a broad audience of scientists, students, teachers, artists, and more. High-fidelity digital 3D models allow people from multiple communities to simultaneously access and use scientific specimens. Based on our multiyear, multi-institution project, we identify significant technological and social hurdles that remain for fully realizing the potential impact of digital 3D specimens.

2.
J Dermatolog Treat ; 35(1): 2338280, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38569598

ABSTRACT

For individuals with atopic dermatitis (AD), interpreting scientific papers that present clinical outcomes including the Eczema Area and Severity Index (EASI) and Investigator's Global Assessment may be difficult. Compared with tabulated data and graphs, images from before and after treatment are often far more meaningful to these patients, who ultimately will be candidates for the treatment. This systematic review focused on determining the frequency of clinical image sharing in AD research. Conducted in accordance with PRISMA guidelines, the review concentrated on randomized controlled trials that investigated predefined and available systemic treatments for AD. The search was performed in the MEDLINE database for studies published from inception until 21 December 2023. The review included 60 studies, encompassing 17,799 randomized patients. Across these studies, 16 images representing 6 patients were shared in the manuscripts, leading to a sharing rate of 0.3‰. The near-total absence of patient images in clinical trial publications hinders patient understanding. Adding images to scientific manuscripts could significantly improve patients' comprehension of potential treatment outcomes. This review highlights the need for authors, the pharmaceutical industry, study sponsors, and publishers to enhance and promote patient information through increased use of visual data.


Subjects
Atopic Dermatitis, Humans, Atopic Dermatitis/drug therapy, Prospective Studies, Randomized Controlled Trials as Topic, Treatment Outcome, Cutaneous Administration, Severity of Illness Index
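
The 0.3‰ sharing rate reported in entry 2 can be reproduced from the counts given in the abstract. The short sketch below assumes the rate is computed per patient (6 patients with images out of 17,799 randomized) rather than per image; the variable names are illustrative only.

```python
# Counts taken from the abstract; using patients (not images) as the
# numerator is an assumption about how the rate was derived.
patients_with_images = 6
randomized_patients = 17_799

# Per mille (‰) means parts per thousand.
sharing_rate_per_mille = patients_with_images / randomized_patients * 1000

print(f"{sharing_rate_per_mille:.1f} per mille")
```

Counting the 16 images instead of the 6 patients would give roughly 0.9‰, so the per-patient reading is the one consistent with the reported figure.
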
3.
Int J Eat Disord ; 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38597344

ABSTRACT

OBJECTIVE: To provide a brief overview of artificial intelligence (AI) application within the field of eating disorders (EDs) and propose focused solutions for research. METHOD: An overview and summary of AI application pertinent to EDs with focus on AI's ability to address issues relating to data sharing and pooling (and associated privacy concerns), data augmentation, as well as bias within datasets is provided. RESULTS: In addition to clinical applications, AI can utilize useful tools to help combat commonly encountered challenges in ED research, including issues relating to low prevalence of specific subpopulations of patients, small overall sample sizes, and bias within datasets. DISCUSSION: There is tremendous potential to embed and utilize various facets of artificial intelligence (AI) to help improve our understanding of EDs and further evaluate and investigate questions that ultimately seek to improve outcomes. Beyond the technology, issues relating to regulation of AI, establishing ethical guidelines for its application, and the trust of providers and patients are all needed for ultimate adoption and acceptance into ED practice. PUBLIC SIGNIFICANCE: Artificial intelligence (AI) offers a promise of significant potential within the realm of eating disorders (EDs) and encompasses a broad set of techniques that offer utility in various facets of ED research and by extension delivery of clinical care. Beyond the technology, issues relating to regulation, establishing ethical guidelines for application, and the trust of providers and patients are needed for the ultimate adoption and acceptance of AI into ED practice.

4.
bioRxiv ; 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38562736

ABSTRACT

The tree-like morphology of neurons and glia is a key cellular determinant of circuit connectivity and metabolic function in the nervous system of essentially all animals. To elucidate the contribution of specific cell types to both physiological and pathological brain states, it is important to access detailed neuroanatomy data for quantitative analysis and computational modeling. NeuroMorpho.Org is the largest online collection of freely available digital neural reconstructions and related metadata and is continuously updated with new uploads. Earlier in the project, we released multiple datasets together yearly, but this process caused an average delay of several months in making the data public. Moreover, in the past 5 years, >80% of invited authors agreed to share their data with the community via NeuroMorpho.Org, up from <20% in the first 5 years of the project. In the same period, the average number of reconstructions per publication increased 600%, creating the need for automatic processing to release more reconstructions in less time. The progressive automation of our pipeline enabled the transition to agile releases of individual datasets as soon as they are ready. The overall time from data identification to public sharing decreased by 63.7%; 78% of the datasets are now released in less than 3 months with an average workflow duration below 40 days. Furthermore, the mean processing time per reconstruction dropped from 3 hours to 2 minutes. With these continuous improvements, NeuroMorpho.Org strives to forge a positive culture of open data. Most importantly, the new, original research enabled through reuse of datasets across the world has a multiplicative effect on science discovery, benefiting both authors and users.

6.
JAMIA Open ; 7(2): ooae025, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38617994

ABSTRACT

Objectives: A data commons is a software platform for managing, curating, analyzing, and sharing data with a community. The Pandemic Response Commons (PRC) is a data commons designed to provide a data platform for researchers studying an epidemic or pandemic. Methods: The PRC was developed using the open source Gen3 data platform and is based upon consortium, data, and platform agreements developed by the not-for-profit Open Commons Consortium. A formal consortium of Chicagoland area organizations was formed to develop and operate the PRC. Results: The consortium developed a general PRC and an instance of it for the Chicagoland region called the Chicagoland COVID-19 Commons. A Gen3 data platform was set up and operated with policies, procedures, and controls for a NIST SP 800-53 revision 4 Moderate system. A consensus data model for the commons was developed, and a variety of datasets were curated, harmonized and ingested, including statistical summary data about COVID cases, patient level clinical data, and SARS-CoV-2 viral variant data. Discussion and conclusions: Given the various legal and data agreements required to operate a data commons, a PRC is designed to be in place and operating at a low level prior to the occurrence of an epidemic, with the activities increasing as required during an epidemic. A regional instance of a PRC can also be part of a broader data ecosystem or data mesh consisting of multiple regional commons supporting pandemic response through sharing regional data.

7.
Cureus ; 16(3): e56193, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38618347

ABSTRACT

In the ever-evolving landscape of biomedical research and publishing, the International Committee of Medical Journal Editors recommendations serve as a critical framework for maintaining ethical standards. By providing a framework that adapts to technological advancements, the International Committee of Medical Journal Editors recommendations actively shape responsible and transparent practices, ensuring the integrity of scientific inquiry and fostering global collaboration in the ever-evolving landscape of medical publishing. This editorial delves into key aspects of the latest changes in the International Committee of Medical Journal Editors recommendations, focusing on authorship, conflict of interest disclosure, data sharing and reproducibility, medical publishing and carbon emissions, the use of artificial intelligence, and the challenges posed by predatory journals within the realm of open access. It emphasizes the importance of new recommendations, which represent a beacon of ethical guidance in the ever-evolving domain of biomedical research and publishing.

8.
JMIR Form Res ; 8: e53241, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38648097

ABSTRACT

BACKGROUND: Electronic health records are a valuable source of patient information that must be properly deidentified before being shared with researchers. This process requires expertise and time. In addition, synthetic data have considerably reduced the restrictions on the use and sharing of real data, allowing researchers to access it more rapidly with far fewer privacy constraints. Therefore, there has been a growing interest in establishing a method to generate synthetic data that protects patients' privacy while properly reflecting the data. OBJECTIVE: This study aims to develop and validate a model that generates valuable synthetic longitudinal health data while protecting the privacy of the patients whose data are collected. METHODS: We investigated the best model for generating synthetic health data, with a focus on longitudinal observations. We developed a generative model that relies on the generalized canonical polyadic (GCP) tensor decomposition. This model also involves sampling from a latent factor matrix of GCP decomposition, which contains patient factors, using sequential decision trees, copula, and Hamiltonian Monte Carlo methods. We applied the proposed model to samples from the MIMIC-III (version 1.4) data set. Numerous analyses and experiments were conducted with different data structures and scenarios. We assessed the similarity between our synthetic data and the real data by conducting utility assessments. These assessments evaluate the structure and general patterns present in the data, such as dependency structure, descriptive statistics, and marginal distributions. Regarding privacy disclosure, our model preserves privacy by preventing the direct sharing of patient information and eliminating the one-to-one link between the observed and model tensor records. This was achieved by simulating and modeling a latent factor matrix of GCP decomposition associated with patients. 
RESULTS: The findings show that our model is a promising method for generating synthetic longitudinal health data that is similar enough to real data. It can preserve the utility and privacy of the original data while also handling various data structures and scenarios. In certain experiments, all simulation methods used in the model produced the same high level of performance. Our model is also capable of addressing the challenge of sampling patients from electronic health records. This means that we can simulate a variety of patients in the synthetic data set, which may differ in number from the patients in the original data. CONCLUSIONS: We have presented a generative model for producing synthetic longitudinal health data. The model is formulated by applying the GCP tensor decomposition. We have provided 3 approaches for the synthesis and simulation of a latent factor matrix following the process of factorization. In brief, we have reduced the challenge of synthesizing massive longitudinal health data to synthesizing a nonlongitudinal and significantly smaller data set.

9.
JMIR Med Inform ; 12: e49646, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38654577

ABSTRACT

Background: The SARS-CoV-2 pandemic has demonstrated once again that rapid collaborative research is essential for the future of biomedicine. Large research networks are needed to collect, share, and reuse data and biosamples to generate collaborative evidence. However, setting up such networks is often complex and time-consuming, as common tools and policies are needed to ensure interoperability and the required flows of data and samples, especially for handling personal data and the associated data protection issues. In biomedical research, pseudonymization detaches directly identifying details from biomedical data and biosamples and connects them using secure identifiers, the so-called pseudonyms. This protects privacy by design but allows the necessary linkage and reidentification. Objective: Although pseudonymization is used in almost every biomedical study, there are currently no pseudonymization tools that can be rapidly deployed across many institutions. Moreover, using centralized services is often not possible, for example, when data are reused and consent for this type of data processing is lacking. We present the ORCHESTRA Pseudonymization Tool (OPT), developed under the umbrella of the ORCHESTRA consortium, which faced exactly these challenges when it came to rapidly establishing a large-scale research network in the context of the rapid pandemic response in Europe. Methods: To overcome challenges caused by the heterogeneity of IT infrastructures across institutions, the OPT was developed based on programmable runtime environments available at practically every institution: office suites. The software is highly configurable and provides many features, from subject and biosample registration to record linkage and the printing of machine-readable codes for labeling biosample tubes. 
Special care has been taken to ensure that the algorithms implemented are efficient so that the OPT can be used to pseudonymize large data sets, which we demonstrate through a comprehensive evaluation. Results: The OPT is available for Microsoft Office and LibreOffice, so it can be deployed on Windows, Linux, and MacOS. It provides multiuser support and is configurable to meet the needs of different types of research projects. Within the ORCHESTRA research network, the OPT has been successfully deployed at 13 institutions in 11 countries in Europe and beyond. As of June 2023, the software manages data about more than 30,000 subjects and 15,000 biosamples. Over 10,000 labels have been printed. The results of our experimental evaluation show that the OPT offers practical response times for all major functionalities, pseudonymizing 100,000 subjects in 10 seconds using Microsoft Excel and in 54 seconds using LibreOffice. Conclusions: Innovative solutions are needed to make the process of establishing large research networks more efficient. The OPT, which leverages the runtime environment of common office suites, can be used to rapidly deploy pseudonymization and biosample management capabilities across research networks. The tool is highly configurable and available as open-source software.
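
The pseudonymization idea described in entry 9 (detach directly identifying details, link them through a pseudonym, keep reidentification possible) can be sketched as follows. This is a minimal illustration under stated assumptions, not the OPT's actual algorithm: the function `pseudonymize`, the field names, and the use of random 16-character hex pseudonyms are all hypothetical.

```python
import secrets


def pseudonymize(records, identifying_fields):
    """Split records into a local key table and a shareable research table.

    Hypothetical sketch: pseudonyms are random tokens, so they reveal
    nothing about the identifying details they replace.
    """
    key_table = {}      # pseudonym -> identifying details (stays at the institution)
    research_rows = []  # records with identifying fields removed (can be shared)

    for record in records:
        pseudonym = secrets.token_hex(8)
        while pseudonym in key_table:  # regenerate on the (unlikely) collision
            pseudonym = secrets.token_hex(8)

        key_table[pseudonym] = {f: record[f] for f in identifying_fields}
        row = {k: v for k, v in record.items() if k not in identifying_fields}
        row["pseudonym"] = pseudonym
        research_rows.append(row)

    return key_table, research_rows


# Made-up example data for illustration only.
records = [
    {"name": "Alice", "dob": "1980-01-01", "egfr": 54},
    {"name": "Bob", "dob": "1975-05-05", "egfr": 61},
]
key_table, research_rows = pseudonymize(records, ["name", "dob"])

# Authorized reidentification: look the pseudonym up in the local key table.
first_subject = key_table[research_rows[0]["pseudonym"]]
```

Only `research_rows` would leave the institution in this sketch; linkage and reidentification remain possible for whoever holds `key_table`, which mirrors the "privacy by design but with linkage" property described in the abstract.
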

10.
JMIR Public Health Surveill ; 10: e51880, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38656780

ABSTRACT

During public health crises, the significance of rapid data sharing cannot be overstated. In attempts to accelerate COVID-19 pandemic responses, discussions within society and scholarly research have focused on data sharing among health care providers, across government departments at different levels, and on an international scale. A lesser-addressed yet equally important approach to sharing data during the COVID-19 pandemic and other crises involves cross-sector collaboration between government entities and academic researchers. Specifically, this refers to dedicated projects in which a government entity shares public health data with an academic research team for data analysis to receive data insights to inform policy. In this viewpoint, we identify and outline documented data sharing challenges in the context of COVID-19 and other public health crises, as well as broader crisis scenarios encompassing natural disasters and humanitarian emergencies. We then argue that government-academic data collaborations have the potential to alleviate these challenges, which should place them at the forefront of future research attention. In particular, for researchers, data collaborations with government entities should be considered part of the social infrastructure that bolsters their research efforts toward public health crisis response. Looking ahead, we propose a shift from ad hoc, intermittent collaborations to cultivating robust and enduring partnerships. Thus, we need to move beyond viewing government-academic data interactions as 1-time sharing events. Additionally, given the scarcity of scholarly exploration in this domain, we advocate for further investigation into the real-world practices and experiences related to sharing data from government sources with researchers during public health crises.


Subjects
COVID-19, Information Dissemination, Public Health, Humans, COVID-19/epidemiology, Public Health/trends, Information Dissemination/methods, Government, Pandemics
11.
J Med Internet Res ; 26: e49445, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38657232

ABSTRACT

BACKGROUND: Sharing data from clinical studies can accelerate scientific progress, improve transparency, and increase the potential for innovation and collaboration. However, privacy concerns remain a barrier to data sharing. Certain concerns, such as reidentification risk, can be addressed through the application of anonymization algorithms, whereby data are altered so that it is no longer reasonably related to a person. Yet, such alterations have the potential to influence the data set's statistical properties, such that the privacy-utility trade-off must be considered. This has been studied in theory, but evidence based on real-world individual-level clinical data is rare, and anonymization has not broadly been adopted in clinical practice. OBJECTIVE: The goal of this study is to contribute to a better understanding of anonymization in the real world by comprehensively evaluating the privacy-utility trade-off of differently anonymized data using data and scientific results from the German Chronic Kidney Disease (GCKD) study. METHODS: The GCKD data set extracted for this study consists of 5217 records and 70 variables. A 2-step procedure was followed to determine which variables constituted reidentification risks. To capture a large portion of the risk-utility space, we decided on risk thresholds ranging from 0.02 to 1. The data were then transformed via generalization and suppression, and the anonymization process was varied using a generic and a use case-specific configuration. To assess the utility of the anonymized GCKD data, general-purpose metrics (ie, data granularity and entropy), as well as use case-specific metrics (ie, reproducibility), were applied. Reproducibility was assessed by measuring the overlap of the 95% CI lengths between anonymized and original results. RESULTS: Reproducibility measured by 95% CI overlap was higher than utility obtained from general-purpose metrics. 
For example, granularity varied between 68.2% and 87.6%, and entropy varied between 25.5% and 46.2%, whereas the average 95% CI overlap was above 90% for all risk thresholds applied. A nonoverlapping 95% CI was detected in 6 estimates across all analyses, but the overwhelming majority of estimates exhibited an overlap over 50%. The use case-specific configuration outperformed the generic one in terms of actual utility (ie, reproducibility) at the same level of privacy. CONCLUSIONS: Our results illustrate the challenges that anonymization faces when aiming to support multiple likely and possibly competing uses, while use case-specific anonymization can provide greater utility. This aspect should be taken into account when evaluating the associated costs of anonymized data and attempting to maintain sufficiently high levels of privacy for anonymized data. TRIAL REGISTRATION: German Clinical Trials Register DRKS00003971; https://drks.de/search/en/trial/DRKS00003971. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.1093/ndt/gfr456.


Subjects
Data Anonymization, Humans, Chronic Renal Insufficiency/therapy, Information Dissemination/methods, Algorithms, Germany, Confidentiality, Privacy
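
Entry 11 measures reproducibility as the overlap of 95% CIs between original and anonymized results. The abstract does not state the exact formula, so the sketch below assumes one commonly used definition: the length of the intersection of the two intervals, averaged as a share of each interval's own length. The function name `ci_overlap` and the example intervals are illustrative.

```python
def ci_overlap(original, anonymized):
    """Average share of each interval covered by their intersection.

    Assumed definition (not confirmed by the abstract): 1.0 means the
    intervals coincide, 0.0 means they do not overlap at all.
    """
    orig_lo, orig_hi = original
    anon_lo, anon_hi = anonymized
    if orig_hi <= orig_lo or anon_hi <= anon_lo:
        raise ValueError("each interval must have positive length")

    # Length of the intersection; zero when the intervals are disjoint.
    intersection = max(0.0, min(orig_hi, anon_hi) - max(orig_lo, anon_lo))

    return 0.5 * (
        intersection / (orig_hi - orig_lo)
        + intersection / (anon_hi - anon_lo)
    )


# Made-up intervals: the anonymized estimate is shifted by half its width.
example = ci_overlap((0.0, 2.0), (1.0, 3.0))
```

Under this definition a nonoverlapping pair of intervals, such as the 6 estimates mentioned in the abstract, would score 0.
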
12.
PeerJ Comput Sci ; 10: e1962, 2024.
Article in English | MEDLINE | ID: mdl-38660153

ABSTRACT

Data sharing is increasingly important across various industries. However, issues such as data integrity verification during sharing, encryption key leakage, and difficulty sharing data between different user groups have been identified. To address these challenges, this study proposes a multi-group data sharing network model based on Consortium Blockchain and IPFS for P2P sharing. This model uses a dynamic key encryption algorithm to provide secure data sharing, avoiding the problems associated with existing data transmission techniques such as key cracking or data leakage due to low security and reliability. Additionally, the model establishes an IPFS network for users within the group, allowing for the generation of data probes to verify data integrity, and the use of the Fabric network to record log information and probe data related to data operations and encryption. Data owners retain full control over access to their data to ensure privacy and security. The experimental results show that the system proposed in this study has wide applicability.

13.
Clin Transl Oncol ; 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38635076

ABSTRACT

PURPOSE: This study assessed the Open Science scenario of cancer research during the period 2011-2021, in terms of the derived scientific publications and raw data dissemination. METHODS: A cancer search equation was executed in the Science Citation Index-Expanded, collecting the papers signed by at least one Spanish institution. The same search strategy was performed in the Data Citation Index to describe dataset diffusion. RESULTS: 50,822 papers were recovered, 71% of which belong to first and second quartile journals. 59% of the articles were published in Open Access (OA) journals. The Open Access model and international collaboration positively influenced the number of citations received. PLOS ONE, Cancers, and Clinical and Translational Oncology stood out among the most productive journals. 2693 genomics, proteomics and metabolomics datasets were retrieved, with Gene Expression Omnibus the favoured repository. CONCLUSIONS: There has been an increase in oncology publications in Open Access. Most were published in first quartile journals and received more citations than non-Open Access articles, as did research carried out by international teams, which is relevant in the context of Open Science. Genetic repositories have been the preferred venue for sharing oncology datasets. Further investigation of research and data sharing in oncology is needed, supported by stronger Open Science policies, to achieve better data sharing practices among the three main scientific pillars: researchers, publishers, and scientific organizations.

15.
EClinicalMedicine ; 71: 102551, 2024 May.
Article in English | MEDLINE | ID: mdl-38533128

ABSTRACT

Background: To receive the best care, people share their health data (HD) with their health practitioners (known as sharing HD for primary purposes). However, during the past two decades, sharing for other (i.e., secondary) purposes has become of great importance in numerous fields, including public health, personalized medicine, research, and development. We aimed to conduct the first comprehensive overview of all studies that investigated people's HD sharing attitudes, along with associated barriers/motivators and significant influencing factors, for all data types and across both primary and secondary uses. Methods: We searched PubMed, MEDLINE, PsycINFO, Web of Science, EMBASE, and CINAHL for relevant studies published in English between database inception and February 28, 2023, using a predefined set of keywords. Studies were included, regardless of their design, if they reported outcomes related to attitudes towards sharing HD. We extracted key data from the included studies, including the type of HD involved and findings related to: HD sharing attitudes (either in general or depending on type of data/user); barriers/motivators/benefits/concerns of the study participants; and sociodemographic and other variables that could impact HD sharing behaviour. The qualitative synthesis was conducted by dividing the studies according to the data type (resulting in five subgroups) as well as the purpose the data sharing was focused on (primary, secondary or both). The Newcastle-Ottawa Scale (NOS) was used to assess the quality of non-randomised studies. This work was registered with PROSPERO, CRD42023413822.
Findings: Of 2109 studies identified through our search, 116 were included in the qualitative synthesis, yielding a total of 228,501 participants and various types of HD represented: person-generated HD (n = 17 studies and 10,771 participants), personal HD in general (n = 69 studies and 117,054 participants), biobank data (n = 7 studies and 27,073 participants), genomic data (n = 13 studies and 54,716 participants), and miscellaneous data (n = 10 studies and 18,887 participants). The majority of studies had a moderate level of quality (83 [71.6%] of 116 studies), but varying levels of quality were observed across the included studies. Overall, sharing intentions for primary purposes were high regardless of data type, and higher than sharing intentions for secondary purposes. Sharing for secondary purposes yielded variable findings, where both the highest and the lowest intention rates were observed in the case of studies that explored sharing biobank data (98% and 10%, respectively). Several factors influencing sharing intentions were identified, such as the type of data recipient, the type of data, and consent. Further, concerns related to data sharing that were common to all data types included privacy, security, and data access/control, while the perceived benefits included those related to improvements in healthcare. Findings regarding attitudes towards sharing varied significantly across sociodemographic factors and depended on data type and type of use. In most cases, these findings were derived from single studies and therefore warrant confirmation by additional studies. Interpretation: Sharing health data is a complex issue that is influenced by various factors (the type of health data, the intended use, the data recipient, among others) and these insights could be used to overcome barriers, address people's concerns, and focus on spreading awareness about the data sharing process and benefits. Funding: None.

16.
Article in English | MEDLINE | ID: mdl-38509403

ABSTRACT

Population neuroscience aims to advance our understanding of how genetic and environmental factors influence brain development and brain health over the life span, by integrating genomics, epidemiology, and neuroscience at population scale. This big data approach depends on data sharing strategies at both the micro- and macro-level, as well as attention to effective data management and protection of participant privacy. At the micro-level, researchers participate in international consortia that support collaboration, standards, and data sharing. They also seek to link together cohort studies, administrative health databases, and measures of the physical, built, and social environment in creative ways. Large-scale, longitudinal, and multi-modal cohorts are being designed to support explorations of genetic and environmental impacts on the brain. At a macro-level, funding agency policies now require data across health research domains to be managed according to the FAIR (findable, accessible, interoperable, and re-useable) Data principles and made available to the research community in a timely manner to support reproducibility and re-use. Data repositories provide technical infrastructure for storing, accessing, and increasingly also analyzing rich population-level data. Federated and cloud-based approaches are being leveraged to improve the security, remote accessibility, and performance of repositories. Finally, legal frameworks are being developed to facilitate secure health data access, integration, and analysis, providing new opportunities for the field.

17.
J Bus Ethics ; 190(3): 649-659, 2024.
Article in English | MEDLINE | ID: mdl-38487176

ABSTRACT

Data access and data sharing are vital to advance medicine. A growing number of public-private partnerships are set up to facilitate data access and sharing, as private and public actors possess highly complementary health data sets and treatment development resources. However, the priorities and incentives of public and private organizations are frequently in conflict. This has complicated partnerships and sparked public concerns around ethical issues such as trust, justice or privacy, in turn raising an important problem in business and data ethics: how can ethical theory inform the practice of public and private partners to mitigate misaligned incentives, and ensure that they can deliver societally beneficial innovation? In this paper, we report on the development of the Swiss Personalized Health Network's ethical guidelines for health data sharing in public-private partnerships. We describe the process of identifying ethical issues and engaging core stakeholders to incorporate their practical reality on these issues. Our report highlights core ethical issues in health data public-private partnerships and provides strategies for how to overcome these in the Swiss health data context. By agreeing on and formalizing ethical principles and practices at the beginning of a partnership, partners and society can benefit from a relationship built around a mutual commitment to ethical principles. We present this summary in the hope that it will contribute to the global data sharing dialogue.

18.
Eur J Intern Med ; 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38461061
19.
J Health Psychol ; : 13591053241239109, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38549221

ABSTRACT

Qualitative research plays a pivotal role in health psychology, offering insights into the intricacies of health-related issues. However, the specificity of qualitative methodology presents challenges in adhering to standard open science principles, including data sharing. The guidelines to address these issues are limited. Drawing from the author's experience in conducting in-depth interviews with middle-aged and older adults regarding their sexuality, this article discusses various challenges in implementing data sharing requirements. It emphasizes factors like participants' reasonable reluctance to share in specific populations, the depth of personal information gleaned from comprehensive interviews, concerns surrounding potential data misuse both within and outside academic circles, and the complex issue of obtaining informed consent. A universal approach to data sharing in qualitative research proves impractical, emphasizing the necessity for adaptable, context-specific guidelines that acknowledge the methodology's nuances. Striking a balance between transparency and ethical responsibility requires tailored strategies and thoughtful consideration.

20.
J Am Med Inform Assoc ; 31(5): 1135-1143, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38457282

ABSTRACT

OBJECTIVES: Clinical trial data sharing is crucial for promoting transparency and collaborative efforts in medical research. Differential privacy (DP) is a formal statistical technique for anonymizing shared data that balances privacy of individual records and accuracy of replicated results through a "privacy budget" parameter, ε. DP is considered the state of the art in privacy-protected data publication and is underutilized in clinical trial data sharing. This study is focused on identifying ε values for the sharing of clinical trial data. MATERIALS AND METHODS: We analyzed 2 clinical trial datasets with privacy budget ε ranging from 0.01 to 10. Smaller values of ε entail adding greater amounts of random noise, with better privacy as a result. Comparison of rates, odds ratios, means, and mean differences between the original clinical trial datasets and the empirical distribution of the DP estimator was performed. RESULTS: The DP rate closely approximated the original rate of 6.5% when ε > 1. The DP odds ratio closely aligned with the original odds ratio of 0.689 when ε ≥ 3. The DP mean closely approximated the original mean of 164.64 when ε ≥ 1. As ε increased to 5, both the minimum and maximum DP means converged toward the original mean. DISCUSSION: There is no consensus on how to choose the privacy budget ε. The definition of DP does not specify the required level of privacy, and there is no established formula for determining ε. CONCLUSION: Our findings suggest that the application of DP holds promise in the context of sharing clinical trial data.


Subjects
Biomedical Research, Privacy, Information Dissemination/methods, Consensus
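
Entry 20 notes that smaller values of the privacy budget ε add more random noise. A minimal sketch of that trade-off, assuming the Laplace mechanism applied to an event count (the abstract does not say which DP mechanism the authors used), is shown below. The cohort of 65 events among 1,000 participants is made up to match the 6.5% rate in the abstract, and the function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two iid exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_rate(events, n, epsilon, rng):
    """Differentially private event rate (Laplace mechanism, assumed).

    Adding or removing one record changes the count by at most 1,
    so the noise scale is 1 / epsilon. The cohort size n is treated
    as public in this sketch.
    """
    noisy_count = events + laplace_noise(1.0 / epsilon, rng)
    # Clamping is post-processing and does not weaken the DP guarantee.
    return min(max(noisy_count / n, 0.0), 1.0)


def mean_abs_error(events, n, epsilon, trials=2000, seed=0):
    """Average absolute gap between the DP rate and the true rate."""
    rng = random.Random(seed)
    true_rate = events / n
    total = sum(
        abs(dp_rate(events, n, epsilon, rng) - true_rate)
        for _ in range(trials)
    )
    return total / trials


# Hypothetical cohort: 65 events among 1,000 participants (6.5%).
for eps in (0.01, 0.1, 1.0, 10.0):
    print(eps, mean_abs_error(65, 1000, eps))
```

In this toy setting the error shrinks as ε grows, which is the qualitative pattern the abstract reports; the specific thresholds in the study (for example ε > 1 for the rate) depend on the real datasets and estimators and are not reproduced here.
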